New Refinement Schemes for Voice Conversion
Abstract
This paper proposes new refinement schemes for voice conversion. We take mel-frequency cepstral coefficients (MFCCs) as the basic feature and apply cepstral mean subtraction to compensate for channel effects. We propose an S/U/V (silence/unvoiced/voiced) decision rule so that two sets of codebooks capture the difference between the unvoiced and voiced segments of the source speaker. We then apply three schemes to refine the synthesized voice: pitch refinement with PSOLA, energy equalization, and frame concatenation based on synchronized pitch marks. An ABX listening test and MOS grades demonstrate the satisfactory performance of the voice conversion system.
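Two of the steps named in the abstract are simple enough to sketch. The abstract gives no implementation details, so the following is only a minimal illustration under stated assumptions: cepstral mean subtraction is taken to be the usual per-utterance removal of each coefficient's mean, and energy equalization is assumed to mean scaling a converted frame so that its RMS energy matches a reference frame. The function names and the list-of-lists data layout are hypothetical, not taken from the paper.

```python
import math


def cepstral_mean_subtraction(frames):
    """Subtract each coefficient's mean over all frames of one utterance.

    `frames` is a list of equal-length MFCC vectors (hypothetical layout).
    """
    if not frames:
        return []
    n_coeffs = len(frames[0])
    means = [sum(f[k] for f in frames) / len(frames) for k in range(n_coeffs)]
    return [[f[k] - means[k] for k in range(n_coeffs)] for f in frames]


def rms(samples):
    """Root-mean-square amplitude of a non-empty list of samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def equalize_energy(frame, reference):
    """Scale `frame` so its RMS matches that of `reference`.

    Assumption: this gain matching stands in for the paper's energy
    equalization step, whose exact rule the abstract does not give.
    """
    frame_rms = rms(frame)
    if frame_rms == 0.0:
        return list(frame)  # silent frame: no gain can be computed
    gain = rms(reference) / frame_rms
    return [s * gain for s in frame]


# Toy usage with made-up numbers.
mfcc = [[1.0, 2.0], [3.0, 6.0]]
normalized = cepstral_mean_subtraction(mfcc)    # each column now has zero mean
scaled = equalize_energy([1.0, -1.0], [2.0, -2.0])  # RMS raised from 1 to 2
```

A stationary channel adds a roughly constant offset in the cepstral domain, so removing the per-utterance mean cancels most of it. This is the reason cepstral mean subtraction is commonly paired with MFCC features.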
Similar resources
New adaptive interpolation schemes for efficient mesh-based motion estimation
Motion estimation and compensation is an essential part of existing video coding systems. Mesh-based motion estimation (MME) produces a smoother motion field, better subjective quality (free from blocking artifacts), and a higher peak signal-to-noise ratio (PSNR) in many cases, especially in low-bitrate video communications, compared to the conventional block matching algorithm (BMA). Howev...
Using Context-based Statistical Models to Promote the Quality of Voice Conversion Systems
This article examines methods for optimizing the performance of GMM-based voice conversion systems, taking the GMM method as the baseline to improve on. In current methods, because a single conversion function converts all speech units and statistical averaging subsequently smooths the spectrum, we observe quality ...
Designing a New Non-parallel Training Method for Voice Conversion That Outperforms Parallel Training
Introduction: Voice mimicking by computer has been one of the most challenging topics in speech processing in recent years. A voice conversion system has two sides. On one side is the source speaker, whose voice is changed to mimic the voice of the target speaker on the other side. Two methods of p...
Voice Conversion Using GMM with Enhanced Global Variance
The goal of voice conversion is to transform a sentence said by one speaker so that it sounds as if another speaker had said it. The classical conversion based on a Gaussian mixture model, and several other schemes suggested since, produce muffled-sounding outputs due to excessive smoothing of the spectral envelopes. To reduce the muffling effect, enhancement of the Global Variance (GV) of the spectral...
VoIP: A comprehensive survey on a promising technology
The Internet has burgeoned into a worldwide information superhighway during the past few years, giving rise to a host of new applications and services. Among them, Voice over IP (VoIP) is the most prominent one. Beginning more as a frolic among computer enthusiasts, VoIP has set off a feeding frenzy in both the industrial and scientific communities and has the potential to radically change tele...
Publication date: 2003